Colorectal cancer (CRC) is the third most common cancer globally (1: Ferlay J. et al., https://www.who.int/news-room/fact-sheets/detail/cancer, https://gco.iarc.fr/today). When diagnosed early, the 5-year survival rate is 92% (2: Cancer Research UK, https://www.cancerresearchuk.org/health-professional/cancer-statistics/statistics-by-cancer-type/bowel-cancer/survival#heading-Three), yet in the United Kingdom 23% of CRCs are diagnosed at an advanced stage (3: Cancer Research UK, https://crukcancerintelligence.shinyapps.io/EarlyDiagnosis/), when the 5-year survival rate is 10% (2). Early CRC has symptoms that are shared with common benign conditions (4: Adelstein B.-A. et al., BMC Gastroenterol. 2011; 11: 65). Colonoscopy capacity is limited, and referring all symptomatic patients for colonoscopy would overwhelm available resources. An intermediate triage test to identify patients at risk of CRC could streamline referral pathways. A breath test based on detecting volatile organic compounds (VOCs) has the ideal characteristics for a triage tool because it is noninvasive, simple to undertake, and acceptable to patients of all ethnicities (5: Woodfield G. et al., BMJ Open. 2021; 11: e044691).

COBRA1 is a prospective, multicenter diagnostic study that aimed to develop a breath test to detect CRC (Research Ethics Committee no. 17/EE01/12; Bowel Cancer Screening Program identification 189; clinicaltrials.gov identifier NCT03699163). The target sample size was 1463 patients (117 CRC, 1346 control subjects), calculated using the power diagnostic test function from the MKmisc R package with a type I error (alpha) of 0.05, power (1 − beta) of 0.8, prevalence of 0.08, and assumed difference of 0.1.
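The relationship between the three sample size figures can be checked with simple arithmetic. The sketch below does not reproduce the power calculation itself; it takes the 117 CRC cases as given and assumes (this is an assumption, not stated in the text) that the number of control subjects follows from the prevalence as cases × (1 − prevalence) / prevalence, rounded up. The function name is hypothetical.

```python
import math


def split_sample_size(n_cases: int, prevalence: float) -> tuple[int, int, int]:
    """Return (cases, controls, total) for a target case count at a given prevalence.

    Assumption: controls = cases * (1 - prevalence) / prevalence, rounded up.
    """
    controls = math.ceil(n_cases * (1 - prevalence) / prevalence)
    return n_cases, controls, n_cases + controls


# 117 cases at a prevalence of 0.08, as stated in the text
cases, controls, total = split_sample_size(117, 0.08)
print(cases, controls, total)
```

Under that assumption the result matches the reported target of 117 CRC patients, 1346 control subjects, and 1463 patients overall.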
The study recruited patients aged 18–90 years attending 7 London hospitals (Appendix 1) between June 2017 and February 2020 for a bowel cancer screening program colonoscopy because of a positive fecal occult blood test (n = 664), colonoscopy for other indications (n = 645), or surgical resection of colorectal adenocarcinoma (n = 123). We recruited a mixed population to ensure an adequate number of cancer cases for model development. The reference test was colonoscopy ± histopathology. The exclusion criterion was concurrent chemotherapy. Breath was collected by trained research nurses immediately before colonoscopy or surgery using the ReCIVA breath sampling device (Owlstone Medical Ltd, Cambridge, UK) onto thermal desorption tubes with standardized settings (6: Doran S.L.F. et al., J Breath Res. 2017; 12: 016007). Patients fasted for 4 hours before breath collection. Quality control measures were performed for breath collection and analysis (Appendix 2).

Thermal desorption tubes were couriered to the VOC laboratory (Imperial College London) for same-day analysis using gas chromatography-mass spectrometry (Agilent Technologies, Cheshire, UK) equipped with a midpolar column (ZB-624, 60 m × 0.25 mm inner diameter × 1.40 μm df; Phenomenex Inc, Torrance, CA). When same-day analysis was not possible, thermal desorption tubes were stored at −80°C for subsequent analysis. Laboratory staff were blinded to disease status. Gas chromatography-mass spectrometry data were acquired using MassHunter software (B.07 SP1; Agilent Technologies) and processed using the custom-designed spectral deconvolution tool MSHub (7: Aksenov A.A. et al., Nat Biotechnol. 2021; 39: 169-173). Machine learning pipelines identified predictive features (VOCs and clinical metadata) to develop a multivariate discriminant analysis model and receiver-operating characteristic (ROC) curves (Appendix 3).
The analysis included 1432 patients (828 men), with a median age of 66.5 years (range, 18–90). No adverse events were reported. Of the 1432 patients, 357 had a normal colonoscopy; 188 had benign pathology (hemorrhoids or diverticular disease); 106 had inflammatory bowel disease; 348, 67, and 204 had low-, intermediate-, and high-risk polyps, respectively; and 162 had colorectal adenocarcinoma. Polyp risk stratification was based on the 2002 British Society of Gastroenterology (BSG) polyp surveillance guidelines (8: Atkin W.S. et al., Gut. 2002; 51: v6-v9), adapted to include dysplasia status, and the BSG 2017 guidance on serrated polyps (9: East J.E., Atkin W.S., Bateman A.C. et al., Gut. 2017; 66: 1181-1196). Patient demographics, cancer characteristics, and exclusion details are presented in Supplementary Table 1. CRC patients were older than control subjects and more likely to have had previous CRC or heart disease or to have used laxatives, antibiotics, or anticoagulants. CRC patients were recruited from surgical lists (n = 119), bowel cancer screening programs (n = 30), and other colonoscopy lists (n = 13). Of the CRCs, 64.2% were T3 or T4. Of the 1432 patients, 855 reported at least 1 symptom at the time of breath sampling.

Across all breath samples, 1024 VOC product ions were detected. The 99 most predictive features (97 VOCs, body mass index, and age) according to random forest scores were annotated and identified using mass spectral libraries (NIST, version 2.0) (10: Shen VK, et al. NIST Standard Reference Simulation Website. https://www.nist.gov/programs-projects/nist-standard-reference-simulation-website). We assessed the origin of VOCs using the Human Metabolome Database 2018 (11: Wishart D.S. et al., Nucleic Acids Res. 2018; 46: D608-D617). Thirty-five VOCs were deemed likely to be exogenous and 37 were of unknown identity, leaving 25 endogenous VOCs for further statistical analysis (Supplementary Table 2).

A diagnostic model comparing all CRC (n = 162) and non-CRC patients (n = 1270) based on 14 endogenous VOCs and body mass index predicted CRC with an area under the ROC curve of 0.87, sensitivity of 79%, specificity of 86%, and negative predictive value of 97% (Figure 1A). A model using data from symptomatic patients only, taken from the same cohort (CRC, n = 146; non-CRC, n = 709), predicted CRC with an area under the ROC curve of 0.91, sensitivity of 83%, specificity of 88%, and negative predictive value of 96% (Figure 1B). Predictive VOCs were largely from the alkane, alcohol, ester, and sulfur-containing chemical groups. Higher levels of dimethyl sulfide and 2-ethoxypropane discriminated right-sided (cecum to transverse colon) from left-sided tumors (P = .002 and P = .045, respectively). Polyps of any risk category (n = 619) could be predicted with an area under the ROC curve of 0.67, sensitivity of 66%, and specificity of 58% when compared with patients with no polyps and no CRC (n = 651) based on 16 endogenous VOCs and age. A model based on high-risk polyps did not improve prediction.

An acknowledged limitation of this study is the use of a selected population, including patients known to have CRC, to recruit enough cancer cases for model development, which resulted in differences in age, medication use, and bowel preparation between groups. This selection precluded symptom-based analysis (although this was not a study aim). Bowel preparation and age were not predictive features or confounding factors in the VOC-based model. Use of medications could not be examined in the machine learning analysis. COBRA1 achieved its aim of constructing a diagnostic model of a VOC-based breath test to detect CRC, with promising results.
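The reported negative predictive values can be related to the reported sensitivity, specificity, and group sizes with a short calculation. The following is a minimal sketch, not the authors' analysis code: it back-calculates approximate confusion-matrix counts from the rounded percentages in the text, so the counts are fractional estimates, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, n_disease, n_no_disease):
    """Return (PPV, NPV) from sensitivity, specificity, and group sizes.

    Counts are estimated from rounded percentages, so they are approximate.
    """
    tp = sensitivity * n_disease      # true positives
    fn = n_disease - tp               # false negatives
    tn = specificity * n_no_disease   # true negatives
    fp = n_no_disease - tn            # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# All patients: 162 CRC vs 1270 non-CRC, sensitivity 79%, specificity 86%
ppv_all, npv_all = predictive_values(0.79, 0.86, 162, 1270)

# Symptomatic patients only: 146 CRC vs 709 non-CRC, sensitivity 83%, specificity 88%
ppv_sym, npv_sym = predictive_values(0.83, 0.88, 146, 709)

print(f"NPV, all patients: {npv_all:.2f}")
print(f"NPV, symptomatic patients: {npv_sym:.2f}")
```

Both values round to the figures quoted in the text (97% and 96%). Because predictive values depend on disease prevalence, and this cohort was deliberately enriched for CRC, the same sensitivity and specificity would give different predictive values in an unselected population.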
The high negative predictive value of the breath test suggests that it could be used as a triage tool. The strengths of this study lie in its multicenter design, large sample size, comprehensive quality control measures, and selection of only endogenous VOCs for model development. The feasibility of multicenter breath collection with centralized sample analysis is also demonstrated. These results support further evaluation of this technology for detecting CRC in an unselected, screening-eligible population, either alone or in combination with other tests, such as the fecal immunochemical test.

We thank the National Institute for Health Research–affiliated breath testing teams at St Mark’s, Charing Cross, St George’s, Homerton, West Middlesex, Chelsea and Westminster, and St Mary’s Hospitals for their patient recruitment and sample collection across 7 sites for this study.

The full study protocol has been provided to the journal as a separate file and can be provided on request. Individual deidentified participant data will not be shared publicly but are available on request from the authors.

The COBRA1 Working Group includes Piers R. Boshier,1 GengPing Lin,1 Antonis Myridakis,1 Oscar Ayrton,1 Patrik Španěl,1,2 Alberto Vidal-Diez,1 Andrea Romano,1 John Martin,3 Laura Marelli,4 Chris Groves,5 Kevin Monahan,1,6 Christos Kontovounisios,1,7 Brian P. Saunders,1,8 from the 1Department of Surgery and Cancer, Imperial College London, London, United Kingdom; 2J. Heyrovský Institute of Physical Chemistry of the Czech Academy of Sciences, Prague, Czech Republic; 3Department of Gastroenterology, Charing Cross Hospital, London, United Kingdom; 4Department of Gastroenterology, Homerton University Hospital, London, United Kingdom; 5Department of Gastroenterology, St George’s Hospital, London, United Kingdom; 6Department of Gastroenterology, West Middlesex University Hospital, London, United Kingdom; 7Department of Surgery, Chelsea and Westminster Hospital, London, United Kingdom; and 8Department of Gastroenterology, St Mark’s Hospital and Academic Institute, London, United Kingdom.

Georgia Woodfield, PhD (Conceptualization: Equal; Data curation: Equal; Formal analysis: Supporting; Funding acquisition: Equal; Investigation: Equal; Methodology: Equal; Project administration: Equal; Writing – original draft: Equal; Writing – review & editing: Equal). Ilaria Belluomo, PhD (Conceptualization: Equal; Data curation: Equal; Formal analysis: Equal; Funding acquisition: Supporting; Investigation: Supporting; Methodology: Supporting; Project administration: Supporting; Resources: Supporting; Supervision: Equal; Writing – original draft: Supporting; Writing – review & editing: Equal). Piers R Boshier, PhD (Conceptualization: Supporting; Formal analysis: Supporting; Methodology: Supporting; Project administration: Supporting; Resources: Supporting; Supervision: Supporting; Writing – original draft: Supporting; Writing – review & editing: Equal). GengPing Lin, MD (Formal analysis: Supporting; Investigation: Equal; Methodology: Supporting; Project administration: Equal; Writing – review & editing: Supporting). Antonis Myridakis, PhD (Data curation: Supporting; Formal analysis: Supporting; Investigation: Supporting; Software: Supporting; Supervision: Supporting; Visualization: Supporting; Writing – review & editing: Supporting).
Oscar Ayrton, MSc (Investigation: Supporting; Methodology: Supporting; Project administration: Supporting; Resources: Supporting; Software: Supporting; Writing – review & editing: Supporting). Ivan Laponogov, PhD (Data curation: Supporting; Formal analysis: Lead; Methodology: Supporting; Software: Lead; Supervision: Supporting; Validation: Lead; Writing – review & editing: Supporting). Kirill Veselkov, PhD (Data curation: Supporting; Formal analysis: Lead; Methodology: Supporting; Software: Lead; Supervision: Supporting; Validation: Supporting; Writing – review & editing: Supporting). Patrik Španěl, Dr. rer. nat (Data curation: Equal; Formal analysis: Equal; Investigation: Supporting; Methodology: Supporting; Resources: Supporting; Supervision: Supporting; Writing – review & editing: Supporting). Alberto Vidal-Diez, PhD (Data curation: Supporting; Formal analysis: Supporting; Methodology: Supporting; Software: Supporting; Writing – review & editing: Supporting). Andrea Romano, PhD (Formal analysis: Supporting; Investigation: Supporting; Software: Supporting; Writing – review & editing: Supporting). John Martin, MD (Project administration: Supporting; Resources: Supporting; Supervision: Supporting; Writing – review & editing: Supporting). Laura Marelli, MD (Project administration: Supporting; Resources: Supporting; Supervision: Supporting; Writing – review & editing: Supporting). Chris Groves, MD (Project administration: Supporting; Resources: Supporting; Supervision: Supporting; Writing – review & editing: Supporting). Kevin Monahan, PhD (Project administration: Supporting; Resources: Supporting; Supervision: Supporting; Writing – review & editing: Supporting). Christos Kontovounisios, PhD (Project administration: Supporting; Resources: Supporting; Supervision: Supporting; Writing – review & editing: Supporting). Brian P Saunders, MD (Supervision: Supporting; Visualization: Supporting; Writing – original draft: Supporting; Writing – review & editing: Supporting).
Amanda J Cross, PhD (Conceptualization: Supporting; Formal analysis: Supporting; Methodology: Supporting; Resources: Supporting; Supervision: Supporting; Writing – original draft: Supporting; Writing – review & editing: Equal). George B Hanna, PhD (Conceptualization: Lead; Data curation: Supporting; Formal analysis: Equal; Funding acquisition: Lead; Investigation: Equal; Methodology: Lead; Project administration: Supporting; Resources: Equal; Software: Equal; Supervision: Lead; Writing – original draft: Supporting; Writing – review & editing: Equal).

Patients attending for colonoscopy were recruited from Charing Cross, St Mark’s, St George’s, and Homerton Hospitals in London. Surgical patients were recruited from St Mary’s, West Middlesex, and Chelsea and Westminster Hospitals in London. Breath samples were obtained on the same day, before colonoscopy or elective surgery. Polyps were stratified into low-risk (1–2 subcentimeter tubular adenomas with low-grade dysplasia or subcentimeter serrated polyps without dysplasia), intermediate-risk (3–4 subcentimeter or one >1-cm tubular adenoma with low-grade dysplasia or >1-cm serrated polyps without dysplasia), and high-risk (≥5 subcentimeter adenomas, ≥3 adenomas if 1 was >1 cm, or any adenoma with high-grade dysplasia or villous change or any serrated polyp with dysplasia) categories.

After conditioning (TC20; Markes Ltd, Llantrisant, UK), 30 tubes were randomly selected and assessed using proton transfer reaction time-of-flight mass spectrometry. If any VOC abundance was >1 ppb, tubes were rechecked and excluded if concentrations remained >1.5 ppb.
We assessed VOC contamination in the CASPER filtration system (Owlstone Medical) every 3 months and replaced the filter if contamination was detected or after 450 hours of use. We checked the flow rate of each pump every 4 weeks, using a flowmeter to measure the flow through an empty thermal desorption (TD) tube attached to each of the tube inlets, to ensure the intended 200 mL/min flow rate for each tube position. The ReCIVA software recorded breath collection parameters for all breath samples (temperature, flow rates of both pumps, and pressure of the facemask against the patient’s face as a surrogate for a good seal). We interrogated the h5 files generated by the ReCIVA software using an in-house script written in the R programming language (11: Wishart D.S. et al., Nucleic Acids Res. 2018; 46: D608-D617) to create a graphic representation of quality control parameters. We visually assessed all outputs to exclude inadequate samples.

We used a threshold-based system to quantify acetone (m/z = 58, retention time [RT] = 8.97) within each TD tube using gas chromatography-mass spectrometry, as an indicator that enough breath sample was present for analysis. Acetone was selected as the reference compound because it is always present in human breath (2: de Lacy Costello B. et al., J Breath Res. 2014; 8: 014001). Samples with an acetone abundance of <4,000,000 area (raw gas chromatography-mass spectrometry data) were excluded. We identified this threshold by comparing 121 breath samples (500 mL), collected using the same protocol and analytical parameters as COBRA1, against 152 nonbiologic controls: empty conditioned TD tubes; 500-mL room air samples collected onto TD tubes using ReCIVA, following a similar procedure to that for patient breath; and TD tubes, previously conditioned and then loaded with a standard mixture not containing any of the tested compounds.
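The acetone exclusion rule described above amounts to a simple threshold filter. The sketch below is illustrative only: the threshold value comes from the text, but the function name, tube labels, and peak areas are hypothetical, and it assumes that a sample exactly at the threshold is retained because the text excludes only values below it.

```python
# Threshold from the text: samples with raw GC-MS acetone peak area < 4,000,000 are excluded
ACETONE_MIN_AREA = 4_000_000


def passes_acetone_qc(acetone_area: float) -> bool:
    """Return True if the acetone peak area indicates enough breath was collected."""
    return acetone_area >= ACETONE_MIN_AREA


# Hypothetical raw acetone peak areas for three TD tubes (illustrative values only)
samples = {"tube_A": 12_500_000, "tube_B": 3_200_000, "tube_C": 4_000_000}

kept = [tube for tube, area in samples.items() if passes_acetone_qc(area)]
excluded = [tube for tube in samples if tube not in kept]
print("kept:", kept)
print("excluded:", excluded)
```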
All but 5 of the 121 breath samples analyzed had a breath acetone level above the identified threshold (Mann-Whitney U test, P < .0001). We used retention time, peak shape, and peak area to assess the consistency and accuracy of instrument analysis of 5 TD tubes loaded with a certified standard mixture using a permeation unit (ES 4050P; Eco Scientific, Gloucestershire, UK) (3: Romano A. et al., Analytic Chem. 2018; 90: 10204-10210). The standard mixture consisted of benzene (63 ppb), phenol (90 ppb), butyric acid (20 ppb), pentanoic acid (5 ppb), hexanoic acid (5 ppb), decanal (4 ppb), and butanal (5 ppb), maintained at 30°C and with a nitrogen flow of 0.9 L/min.

A machine learning pipeline using Python (1: Rossum GV, et al. Python 3 reference manual. Scotts Valley, CA: CreateSpace; 2009) and Scikit-learn (2: Fabian Pedregosa G.V. et al., J Mach Learn Res. 2011; 12: 2825-2830) processed all data (up to 1024 chemical ions per breath sample and metadata for each patient) and identified discriminatory “features” of CRC and non-CRC using both analysis of variance and random forest–based scores. The data were normalized, variance stabilized, and log transformed as part of the machine learning pipeline. Random forest, alphanet, support vector machine (SVM), lasso, and elastic net machine learning prediction methods were used independently to compare every combination and permutation of pathology group. The same analyses were repeated for patients aged 40–59, 45–65, 50–69, and 70–89 years and for all ages together to investigate whether age was confounding the VOC data.
The prediction models considered patient-related factors between groups (age, number of hours of fasting, body mass index, ethnic origin, gender, smoking status, weekly alcohol consumption, type of bowel preparation taken before colonoscopy/surgical resection, and family history of CRC) and sampling-related factors (storage time of the TD tube from conditioning to breath sampling, storage time after-sampling until mass spectrometry analysis, and number of days the TD tube was stored in the freezer [if applicable]). Medications were not inputs into the model, because answers were too numerous and heterogeneous; if every answer was coded, “missing data” would have been very common and would have introduced bias, whereas only sporadic mention of a variable would have added unnecessary noise to the dataset and not have been useful. ROC curves were used to determine the accuracy of the diagnostic test in classifying those with and without colorectal disease. The ROC curves were generated based on 25 runs: 5 repeats of 5-fold stratified K-fold splits with reshuffling between splits. This meant that samples were shuffled and then split into 5 groups. Each group was then used in turn as a test set, whereas the other 4 were the training set. Feature selection and model building (machine learning) were performed on a training set each time (80% of the data) and then applied to the test set (20% of the data) to produce the statistics. This was repeated 5 times, and then the results from different runs were averaged to get ROC curves and error estimates. This way avoided overfitting of the model and any bias of an “unlucky” split if the data happened to have been split in a nonrepresentative way. Features were not weighted in any way but filtered to leave the most discriminating features in each run. If a feature was independently selected to be a differentiating feature regardless of how the data was split, the selection score would be higher. 
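The resampling scheme described above (5 repeats of 5-fold stratified splits, giving 25 train/test runs) can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the authors' pipeline: the function name and seed are hypothetical, and the source does not say which utility was used (scikit-learn provides `RepeatedStratifiedKFold` for the same purpose). Feature selection and model fitting would be performed inside the loop on each training set only.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs: n_repeats reshuffled stratified K-fold splits."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        # Shuffle each class separately and deal its samples across the folds,
        # so every fold keeps roughly the overall CRC/non-CRC ratio.
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)
        # Each fold is the test set once; the other folds form the training set.
        for k in range(n_splits):
            test_idx = sorted(folds[k])
            train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train_idx, test_idx


# 162 CRC (label 1) and 1270 non-CRC (label 0) patients, as in the full cohort
labels = [1] * 162 + [0] * 1270
runs = list(repeated_stratified_kfold(labels))

print(len(runs))  # number of train/test runs
for train_idx, test_idx in runs[:1]:
    # Feature selection and model fitting would use train_idx only;
    # test_idx is reserved for computing the ROC statistics.
    print(len(train_idx), len(test_idx))
```

Keeping feature selection inside each training fold is what prevents information from the test set leaking into the model, which is the point of the design described in the text.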
A higher score therefore meant that the feature in question was more likely to be a true feature differentiating CRC and non-CRC.

Supplementary Table 1. Detailed Patient Demographics of CRC and Control Groups

| Recruitment and exclusions | All patients |
|---|---|
| No. of patients approached for study | 2334 (166 from theatre, 2168 from endoscopy) |
| No. of patients declined to participate | 73 (17 from theatre, 56 from endoscopy) |
| No. of patients consented but not breath sampled (research nurse judged there was insufficient time preprocedure) | 406 (10 from theatre, 396 from endoscopy) |
| No. of patients who had breath sampled | 1855 (139 from theatre, 1716 from endoscopy) |
| No. of patients excluded due to inadequate reference test (incomplete or canceled colonoscopy) | 65 |
| No. of patients excluded before laboratory analysis due to inadequate breath sample (wrong settings used, illegible labeling, human error: tubes not sealed properly) | 61 |
| No. of patients excluded due to failure of ReCIVA quality control (poor breath trace) | 50 |
| No. of patients excluded as samples lost during gas chromatography-mass spectroscopy analysis (human error, instrument fault) | 81 |
| No. of patients excluded due to failure of quality control for VOC presence in TD tubes | 166 |
| No. of patients with high-quality breath samples and reference test and therefore included in the final analysis | 1432 |

| Parameter | CRC patients only (n = 162) | All control subjects (non-CRC) (n = 1270) | P between CRC and control patients |
|---|---|---|---|
| Gender | | | |
| Female | 61 (37.7) | 538 (42.4) | .360 |
| Male | 101 (62.3) | 727 (57.2) | |
| Unrecorded | 0 | 5 (0.4) | |
| Age, y | | | |
| Median (IQR) | 66.5 (17) | 63 (14) | <.001 |
| Minimum to maximum | 30–90 | 18–87 | |
| Unrecorded | 0 | 7 | |
| Body mass index, kg/m2 | | | |
| Median (IQR) | 26 (7) | 26 (8) | .674 |
| Minimum to maximum | 18–41 | 14–48 | |
| Unrecorded | 36 | 661 | |
| Ethnicity | | | |
| Arab | 5 (3.1) | 40 (3.1) | .568 |
| Asian/Asian British | 22 (13.6) | 198 (15.6) | |
| Black/African/Caribbean/Black British | 19 (11.7) | 106 (8.3) | |
| Other ethnicity | 8 (4.9) | 38 (3.0) | |
| White British/European | 104 (64.2) | 858 (67.6) | |
| Unrecorded | 4 (2.5) | 30 (2.4) | |
| Smoking status | | | |
| Current | 14 (8.6) | 171 (13.5) | .080 |
| Ex | 45 (27.8) | 397 (31.3) | |
| Never | 102 (63) | 679 (53.5) | |
| Unrecorded | 1 (0.6) | 23 (1.8) | |
| Alcohol intake status | | | |
| Current | 88 (54.4) | 802 (63.1) | .107 |
| Ex | 11 (6.8) | 62 (4.9) | |
| Never | 61 (37.7) | 380 (29.9) | |
| Unrecorded | 2 (1.2) | 26 (2) | |
| Bowel preparation | | | |
| Total patients receiving bowel preparation | 124 (76.5) | 1263 (99.4) | <.001 |
| No bowel preparation | 37 (22.8) | 3 (0.2) | |
| Unrecorded | 1 (0.6) | 4 (0.3) | |
| Time fasted, h | | | |
| Minimum | 4 | 7 | <.001 |
| Maximum | 36 | 72 | |
| Median (IQR) | 22 (10) | 24 (4) | |
| Unrecorded | 46 | 189 | |
| Reason for attendance | | | |
| Colonoscopy: CRC screening | 30 (18.5) | 634 (49.9) | <.001 (explained by deliberate enrichment of CRC patients from theatre) |
| Colonoscopy: rescope for polyp removal | 1 (0.6) | 26 (2.0) | |
| Colonoscopy: surveillance inflammatory bowel disease/polyps/family history | 3 (1.8) | 241 (19.0) | |
| Colonoscopy: symptoms, 2-week wait | 5 (3.1) | 91 (7.2) | |
| Colonoscopy: symptoms, urgent | 4 (2.5) | 123 (9.7) | |
| Colonoscopy: symptoms, routine | 0 | 147 (11.6) | |
| Theatre resection patient for likely CRC | 119 (73.5) | 4 (0.3) | |
| Unknown reason | 0 (0) | 4 (0.3) | |
| Past medical history | | | |
| Previous CRC | 13 (8.0) | 33 (2.6) | <.001 |
| Previous cancer excluding CRC | 14 (8.6) | 80 (6.3) (a) | .257 |
| Celiac disease | 1 (0.6) | 2 (0.2) | .228 |
| Past bowel resection | 5 (3.1) | 44 (3.5) | .803 |
| Barrett’s esophagus | 0 | 6 (0.5) | .381 |
| High blood pressure | 57 (35.2) | 434 (34.2) | .798 |
| Known heart disease | 28 (17.3) | 132 (10.4) | .009 |
| Diabetes | 31 (19.1) | 186 (14.6) | .133 |
| Renal impairment | 8 (4.9) | 37 (2.9) | .164 |
| Chronic obstructive pulmonary disease | 6 (3.7) | 41 (3.2) | .749 |
| Asthma | 14 (8.6) | 89 (7.0) | .448 |
| Liver impairment | 3 (1.9) | 48 (3.8) | .212 |
| Medications | | | |
| Proton pump inhibitor | 31 (19.1) | 235 (18.5) | .846 |
| Nonsteroidal anti-inflammatory drug | 18 (11.1) | 176 (13.9) | .336 |
| Laxative | 11 (6.8) | 27 (2.1) | .002 |
| Antibiotics | 8 (4.9) | 18 (1.4) | .002 |
| Ranitidine | 3 (1.9) | 15 (1.2) | .471 |
| Clopidogrel | 5 (3.1) | 24 (1.9) | .309 |
| Blood thinner (b) | 14 (8.6) | 44 (3.5) | .002 |
| Immunosuppressants (c) | 10 (6.2) | 54 (4.3) | .265 |
| Patient-reported symptoms (n = 855: 146 CRC patients and 709 control subjects), % of total study patients | | | |
| Bowel symptoms (d) | 67 | 409 | .021 |
| Weight loss | 28 | 67 | .264 |
| Abdominal pain | 40 | 152 | .737 |
| Rectal bleeding | 55 | 232 | .918 |
| Other symptoms (e) | 23 | 79 | .170 |
| Tumor site | | | |
| Left sided: rectum to splenic flexure | 100 (61.7) | NA | NA |
| Right sided: transverse colon to cecum | 62 (38.3) | NA | NA |
| Clinical tumor stage | | | |
| 1 | 20 (12.3) | NA | NA |
| 2 | 36 (22.2) | NA | NA |
| 3 | 63 (38.9) | NA | NA |
| 4 | 41 (25.3) | NA | NA |
| Missing data | 2 (1.2) | NA | NA |
| Clinical nodal stage | | | |
| 0 | 89 (54.9) | NA | NA |
| 1 | 49 (30.2) | NA | NA |
| 2 | 21 (13.0) | NA | NA |
| 3 | 1 (0.6) | NA | NA |
| Missing data | 2 (1.2) | NA | NA |
| Clinical metastasis stage | | | |
| 0 | 138 (85.2) | NA | NA |
| 1 | 22 (13.6) | NA | NA |
| Missing data | 2 (1.2) | NA | NA |
| Differentiation of tumor | | | |
| Well differentiated | 1 (0.6) | NA | NA |
| Moderately differentiated | 129 (79.6) | NA | NA |
| Poorly differentiated | 26 (16.0) | NA | NA |
| Missing data | 6 (3.7) | NA | NA |

Values are n (%) unless otherwise defined.
NA, not applicable.
(a) Fourteen of these patients had >1 previous cancer.
(b) Blood thinners included warfarin, therapeutic doses of low-molecular-weight heparin, and direct oral anticoagulants.
(c) Immunosuppressants included steroids, biologics, and any other immunosuppressive disease-modifying agents. Patients were not taking chemotherapy at the time of breath testing.
(d) Definition of bowel symptoms was change in bowel habit, diarrhea, fecal urgency, or constipation.
(e) Definition of other symptoms was broad: dyspepsia, bloating, rectal pain, flatulence, rectal itching, and anemia. Note: The true number of anemic patients was unknown, because hemoglobin levels were not recorded.

Supplementary Table 2. Top 15 Features (Endogenous-only VOCs and Clinical Features) Capable of Differentiating CRC Patients From Control Patients in Model 1 (CRC vs Non-CRC [n = 1432, CRC = 162]) and Model 2 (CRC vs Non-CRC Symptomatic Patients Only [n = 855, CRC = 146])

| Feature (a) | Retention time (min) | Proportion of deconvoluted peak (%) | Random forest score | Tentative identification | Chemical Abstracts Service registry number | Proposed explanation/mechanism |
|---|---|---|---|---|---|---|
| Model 1: CRC (n = 162) vs non-CRC (n = 1270) | | | | | | |
| 1 | 22.02 | 72 | 0.0102 | Propyl ester of propanoic acid (propyl propionate) | 106-36-5 | Ester metabolism has previously been linked to cancer, where 2 of the most-studied human carboxylesterase enzymes (CES1 and CES2) are markedly altered in cancerous tissue (PMID: 30245959). Esters in breath have previously been found to be predictive for CRC (PMID: 26212114 and PMID: 24820062). |
| 2 | 9.27 | 85 | 0.0100 | Dimethyl sulfide | 75-18-3 | Dimethyl sulfide is a plausible biomarker, as it is produced by anaerobic bacteria in the gut, with disruption of the microbiome in CRC as the potential mechanism for its increased production. Fecal dimethyl sulfide within a panel of 4 other VOCs has also been shown to discriminate high-risk polyp patients from control subjects (PMID: 26086914). Bacteria are also known to metabolize branched-chain alkanes (PMID: 4852318, PMID: 24829093) found to be characteristic of CRC in the current study. |
| 3 | 17.48 | 15 | 0.0036 | 1-Penten-3-ol | 616-25-1 | Increased alcohol abundance in CRC breath may be explained by the “Warburg” effect whereby glycolytic metabolism increases due to the increased proliferation characteristic of cancer cells. This leads to a shift toward anaerobic respiration of glucose. A wide variety of alcohols are endogenous and are found in the breath of humans, making this a plausible biomarker (PMID: 24421258). The human liver and gastrointestinal tract also contain alcohol dehydrogenase enzymes (PMID: 8244116), which would affect alcohol abundance in any blood returning from the gastrointestinal tract through the liver. |
| 4 | 31.14 | 33 | 0.0038 | 3,4-Dimethyl-1,5-cyclooctadiene | 21284-05-9 | Dienes were found in the breath of healthy humans (PMID: 24421258) and therefore could be a plausible biomarker to be altered in disease. |
| 5 | NA | NA | 0.0062 | BMI, metadata | NA | High BMI is a known risk factor for development of CRC (Cancer Research UK data). However, late-stage cancers of all types can also lead to low BMI due to the increased catabolism associated with severe illness. With regards to breath specifically, BMI is known to affect breath acetone levels (PMID: 29516396, PMID: 21725144), although neither acetone nor any other ketone was found to be discriminatory for CRC in the machine learning prediction model. |
| 6 | 32.69 | 58 | 0.0029 | 2-Propenyl ester of acetic acid (allyl acetate) | 591-87-7 | See ester explanation for Feature 1 |
| 7 | 40.12 | 88 | 0.0052 | Branched tetradecane | Unknown | Many alkanes, methylated alkanes, and cycloalkanes have been named as discriminatory markers for CRC (PMID: 26212621, PMID: 26212114, PMID: 30796770). Alkanes may be present in breath due to oxidative stress and lipid peroxidation that occurs in cancer and inflammation (PMID: 18465793). There is also a link with microbial activity and branched alkanes; bacteria are known to be able to metabolize branched-chain alkanes (PMID: 4852318, PMID: 24829093). Altered gut microbiome in CRC could therefore potentially alter the branched alkane abundance. |
| 8 | 23.52 | 29 | 0.0036 | Overlapping ester similar to 2-propenyl ester of acetic acid (allyl acetate) | 591-87-7 | See ester explanation for Feature 1 |
| 9 | 10.41 | 100 | 0.0013 | 2-Methyl-2-propanol | 75-65-0 | See alcohol explanation for Feature 3 |
| 10 | 32.24 | 85 | 0.0019 | 4-Ethyl-1-octyn-3-ol (branched alcohol, molecular weight 130 or 152) | 5877-42-9 | See alcohol explanation for Feature 3 |
| 11 | 31.69 | 62 | 0.0027 | 2,2,4-Trimethyl-3-pentanol | 5162-48-1 | See alcohol explanation for Feature 3 |
| 12 | 4.75 | 100 | 0.0039 | Cyclopropane | 75-19-4 | Cyclopropane has been documented in the breath and feces of healthy humans (PMID: 24421258). Cyclopropane has also been found in higher concentrations in the breath of breast cancer patients compared with healthy control subjects (PMID: 21383471). |
| 13 | 11.67 | 12 | 0.0042 | 2-Ethoxypropane | 625-54-7 | See alkane explanation for Feature 7 |
| 14 | 40.52 | 86 | 0.0040 | 2-Phenoxy-ethanol | 122-99-6 | See alcohol explanation for Feature 3 |
| 15 | 15.48 | 65 | 0.0003 | Heptane | 142-82-5 | See alkane explanation for Feature 7 |
| Model 2: Symptomatic CRC (n = 146) vs symptomatic non-CRC (n = 709) | | | | | | |
| 1 | 22.02 | 72 | 0.0193 | Propyl ester of propanoic acid (propyl propionate) | 106-36-5 | See above explanation |
| 2 | 9.27 | 85 | 0.0110 | Dimethyl sulfide | 75-18-3 | See above explanation |
| 3 | 17.48 | 15 | 0.0038 | 1-Penten-3-ol | 616-25-1 | See above explanation |
| 4 | 40.12 | 88 | 0.0054 | Branched tetradecane | 629-59-4 | See above explanation |
| 5 | 31.14 | 33 | 0.0031 | 3,4-Dimethyl-1,5-cyclooctadiene | 21284-05-9 | See above explanation |
| 6 | 4.75 | 100 | 0.0040 | Cyclopropane | 75-19-4 | See above explanation |
| 7 | 23.52 | 29 | 0.0025 | Overlapping ester similar to 2-propenyl ester of acetic acid (allyl acetate) | 591-87-7 | See above explanation |
| 8 | 32.69 | 58 | 0.0008 | 2-Propenyl ester of acetic acid (allyl acetate) | 591-87-7 | See above explanation |
| 9 | NA | NA | 0.0028 | BMI, metadata | NA | See above explanation |
| 10 | 40.52 | 86 | 0.0033 | 2-Phenoxy-ethanol | 122-99-6 | See above explanation |
| 11 | 11.67 | 12 | 0.0040 | 2-Ethoxypropane | 625-54-7 | See above explanation |
| 12 | 38.74 | 58 | 0.0014 | Branched tridecane | 629-50-5 | See alkane explanation for Feature 7 under Model 1 |
| 13 | 31.69 | 62 | 0.0026 | 2,2,4-Trimethyl-3-pentanol | 5162-48-1 | See above explanation |
| 14 | 32.24 | 85 | 0.0006 | 4-Ethyl-1-octyn-3-ol (branched alcohol, molecular weight 130 or 152) | 5877-42-9 | See above explanation |
| 15 | 10.41 | 100 | 0.0009 | 2-Methyl-2-propanol | 75-65-0 | See above explanation |

Features are ranked according to random forest order of discriminatory value. Random forest feature scores usually sum to 1 when all features that contributed to the model are included. However, the table shows the random forest scores for only the top 15 most promising and realistic biomarker candidates (endogenous only), which is why the scores shown here do not sum to 1. The numerical values of the scores can only be interpreted in relation to the other feature scores, where truly important features should have high scores relative to the noise features.
(a) Features are listed in order of importance to the predictive model.